Mispronunciation Detection Leveraging Maximum Performance Criterion Training of Acoustic Models and Decision Functions

Authors

Yao-Chi Hsu

Ming-Han Yang

Hsiao-Tsung Hung

Berlin Chen

Abstract

Mispronunciation detection is an integral part of a computer-assisted pronunciation training (CAPT) system, helping second-language (L2) learners pinpoint erroneous pronunciations in a given utterance so as to improve their spoken proficiency. This paper continues this general line of research, and its major contributions are twofold. First, we present an effective training approach that estimates the deep neural network based acoustic models involved in the mispronunciation detection process by optimizing an objective directly linked to the ultimate evaluation metric. Second, in the same vein, two disparate logistic sigmoid based decision functions with either phone- or senone-dependent parameterization are also inferred and used for enhanced mispronunciation detection. A series of experiments on a Mandarin mispronunciation detection task seem to show the performance merits of the proposed method.

Similar resources

Maximum F1-Score Discriminative Training for Automatic Mispronunciation Detection in Computer-Assisted Language Learning

In this paper, we propose and evaluate a novel discriminative training criterion for hidden Markov model (HMM) based automatic mispronunciation detection in computer-assisted pronunciation training. The objective function is formulated as a smooth form of the F1-score on the annotated non-native speech database. The objective function maximization is achieved by using extended Baum Welch form l...

Evaluation Metric-related Optimization Methods for Mandarin Mispronunciation Detection

Mispronunciation detection and diagnosis are part and parcel of a computer assisted pronunciation training (CAPT) system, collectively facilitating second-language (L2) learners to pinpoint erroneous pronunciations in a given utterance so as to improve their spoken proficiency. This thesis presents a continuation of such a general line of research and the major contributions are three-fold. Fir...

Vowel mispronunciation detection using DNN acoustic models with cross-lingual training

We address the automatic detection of phone-level mispronunciation for feedback in a computer-aided language learning task where the target language data (Indian English) is limited. Based on the recent success of DNN acoustic models on limited resource recognition tasks, we compare different methods of utilizing the limited target language data in the training of acoustic models that are initi...

Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers

Mispronunciation detection is an important part of a Computer-Aided Language Learning (CALL) system. By automatically pointing out where mispronunciations occur in an utterance, such a system gives a language learner informative and to-the-point feedback. In this paper, we improve mispronunciation detection performance with a Deep Neural Network (DNN) trained acoustic model and transfer learning based...

Outlier detection for acoustic model training using robust statistics

In this paper, we propose an acoustic model training technique which is robust against outliers such as clipping, unexpected noise, poorly pronounced word segments, or mistranscriptions, which deteriorate the quality of the acoustic models and in turn decrease speech recognition performance. The outlier-robust acoustic model training technique is based on a maximum likelihood (ML) criterion and...

Publication date: 2016
